Feature Selection for Short Text Classification using Wavelet Packet Transform

Authors

  • Anuj Mahajan
  • Sharmistha Jat
  • Shourya Roy
Abstract

Text classification tasks suffer from the curse of dimensionality because of their large feature spaces. Short text data exacerbates the problem because it is sparse and noisy. Feature selection therefore becomes an important step in improving classification performance. In this paper, we propose a novel feature selection method based on the Wavelet Packet Transform (WPT). WPT has been used widely in various fields because it encodes transient signals efficiently. We demonstrate how short text classification can benefit from WPT-based feature selection, given the sparse nature of the data. Our technique chooses the most discriminating features by computing inter-class distances in the transformed space. We experimented extensively with several short text datasets. Compared with well-known techniques, our approach reduces the feature space size and significantly improves overall classification performance on all the datasets.
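
The abstract does not spell out the algorithm, so the following is only a minimal sketch of one possible reading of "computing inter-class distances in the transformed space", not the authors' method. It assumes each document is a fixed-length term vector, uses a Haar filter for the wavelet packet decomposition, and scores each transformed coefficient by the summed absolute difference between class means. The function names (`haar_wpt`, `interclass_scores`, `select_top_k`), the Haar filter, and the scoring rule are all illustrative assumptions.

```python
import math
from itertools import combinations


def haar_wpt(signal, level):
    """Full Haar wavelet packet decomposition (assumed filter choice).

    Splits every node, approximation and detail alike, at each level and
    returns the leaf coefficients concatenated into one flat list.
    """
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    nodes = [[float(v) for v in signal]]
    for _ in range(level):
        next_nodes = []
        for node in nodes:
            even, odd = node[0::2], node[1::2]
            # Low-pass (approximation) and high-pass (detail) halves.
            next_nodes.append([(a + b) / math.sqrt(2) for a, b in zip(even, odd)])
            next_nodes.append([(a - b) / math.sqrt(2) for a, b in zip(even, odd)])
        nodes = next_nodes
    return [c for node in nodes for c in node]


def interclass_scores(X, y, level):
    """Score each transformed coefficient by the distance between class means.

    Hypothetical scoring rule: sum of |mean_a - mean_b| over all class pairs.
    """
    coeffs = [haar_wpt(row, level) for row in X]
    means = {}
    for label in sorted(set(y)):
        rows = [c for c, lab in zip(coeffs, y) if lab == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    scores = [0.0] * len(coeffs[0])
    for a, b in combinations(means, 2):
        for i in range(len(scores)):
            scores[i] += abs(means[a][i] - means[b][i])
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring coefficients."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Toy term vectors: class 0 uses the first two terms, class 1 the last two.
X = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 1],
]
y = [0, 0, 1, 1]
scores = interclass_scores(X, y, level=1)
selected = select_top_k(scores, k=2)
```

In this toy case the two selected coefficients are the ones where the class means differ; a classifier would then be trained on those columns of the transformed data only.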

Similar resources

Accurate Fault Classification of Transmission Line Using Wavelet Transform and Probabilistic Neural Network

Fault classification in distance protection of transmission lines has been a very challenging task, given the wide variation in fault operating conditions. This paper presents a probabilistic neural network (PNN) and a new feature selection technique for fault classification in transmission lines. Initially, wavelet transform is used for feature extraction from half cycle of post-fa...

Classification of the mechanomyogram signal

Previous works have resulted in some practical achievements for mechanomyogram (MMG) to control powered prostheses. This work presents the investigation of classifying the hand motion using MMG signals for multifunctional prosthetic control. MMG is thought to reflect the intrinsic mechanical activity of muscle from the lateral oscillations of fibers during contraction. However, external mechani...

Texture Classification of Diffused Liver Diseases Using Wavelet Transforms

Introduction: A major problem facing patients with chronic liver diseases is the diagnostic procedure. The conventional diagnostic method depends mainly on needle biopsy, which is an invasive method. There are some approaches to develop a reliable noninvasive method of evaluating histological changes in sonograms. The main characteristic used to distinguish between the normal...

Classification of transient time-varying signals using DFT and wavelet packet based methods

The classification of transient time-varying signals is important for industrial, biomedical and military applications. The attack phase of piano sounds is used as an example for transient, time-varying signals in a real data application. Discrete Fourier transform and time-invariant wavelet packet based algorithms are used alternatively for feature extraction. The training set is used for dete...

Time-frequency Representation for Classification of the Transient Myoelectric Signal

An accurate and computationally efficient means of classifying myoelectric signal (MES) patterns has been the subject of considerable research effort in recent years. Effective feature extraction is crucial to reliable classification and, in the quest to improve the accuracy of transient MES pattern classification, many forms of signal representation have been suggested. It is shown that featur...

Publication date: 2015